
    Automatic Ontology Generation Based On Semantic Audio Analysis

    Ontologies provide an explicit conceptualisation of a domain and a uniform framework that represents domain knowledge in a machine-interpretable format. The Semantic Web heavily relies on ontologies to provide well-defined meaning and support for automated services based on the description of semantics. However, considering the open, evolving and decentralised nature of the Semantic Web – though many ontology engineering tools have been developed over the last decade – it can be a laborious and challenging task to deal with manual annotation, hierarchical structuring and organisation of data, as well as maintenance of previously designed ontology structures. For these reasons, we investigate how to facilitate the process of ontology construction using semantic audio analysis. The work presented in this thesis contributes to solving the problems of knowledge acquisition and manual construction of ontologies. We develop a hybrid system that involves a formal method of automatic ontology generation for web-based audio signal processing applications. The proposed system uses timbre features extracted from audio recordings of various musical instruments. The system is evaluated using a database of isolated notes and melodic phrases recorded in neutral conditions, and we make a detailed comparison between musical instrument recognition models to investigate their effects on the automatic ontology generation system. Finally, the automatically generated musical instrument ontologies are evaluated in comparison with the terminology and hierarchical structure of the Hornbostel and Sachs organology system. We show that the proposed system is applicable in multi-disciplinary fields that deal with knowledge management and knowledge representation issues. Funding from the EPSRC, OMRAS-2 and NEMA projects.
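
    As a rough illustration of the idea described above (timbre features, hierarchical clustering, and a generated class hierarchy), the Python sketch below builds a toy instrument taxonomy from synthetic notes and serialises it as RDF/OWL. The instrument names, feature choices and namespace are placeholders, not the thesis's actual pipeline.

```python
# Toy sketch: timbre features -> agglomerative clustering -> OWL class hierarchy.
# All instruments, parameters and URIs are hypothetical placeholders.
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

SR = 22050

def synthetic_note(brightness, f0=220.0, seconds=1.0):
    """Stand-in for a recorded isolated note: harmonics with a given decay."""
    t = np.arange(int(SR * seconds)) / SR
    return sum(brightness ** k * np.sin(2 * np.pi * f0 * (k + 1) * t) for k in range(8))

def timbre_vector(signal):
    """Crude timbre descriptor: mean magnitude spectrum in coarse bands."""
    spectrum = np.abs(np.fft.rfft(signal))
    return np.array([band.mean() for band in np.array_split(spectrum, 13)])

# Hypothetical instruments, each with a different spectral envelope.
instruments = {"flute": 0.3, "clarinet": 0.5, "violin": 0.7, "trumpet": 0.8}
names = list(instruments)
X = np.stack([timbre_vector(synthetic_note(b)) for b in instruments.values()])

MO = Namespace("http://example.org/instrument-ontology#")
g = Graph()

def add_node(node):
    """Turn the clustering dendrogram into owl:Class / rdfs:subClassOf triples."""
    uri = MO[names[node.id]] if node.is_leaf() else MO[f"Group_{node.id}"]
    g.add((uri, RDF.type, OWL.Class))
    if not node.is_leaf():
        for child in (node.left, node.right):
            g.add((add_node(child), RDFS.subClassOf, uri))
    return uri

add_node(to_tree(linkage(X, method="ward")))
print(g.serialize(format="turtle"))
```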

    A Deep Multi-View Learning Framework for City Event Extraction from Twitter Data Streams

    Cities have been a thriving place for citizens over the centuries due to their complex infrastructure. The emergence of Cyber-Physical-Social Systems (CPSS) and context-aware technologies boosts a growing interest in analysing, extracting and ultimately understanding city events, which can subsequently be used to leverage citizen observations of their cities. In this paper, we investigate the feasibility of using Twitter textual streams for extracting city events. We propose a hierarchical multi-view deep learning approach to contextualise citizen observations of various city systems and services. Our goal has been to build a flexible architecture that can learn representations useful for the tasks at hand, thus avoiding excessive task-specific feature engineering. We apply our approach to a real-world dataset consisting of event reports and tweets covering over four months from the San Francisco Bay Area, as well as additional datasets collected from London. The results of our evaluations show that the proposed solution outperforms existing models and can be used for extracting city-related events with an average accuracy of 81% over all classes. To further evaluate the impact of our Twitter event extraction model, we used two sources of official reports: road traffic disruption data collected through the Transport for London API, and sociocultural events parsed from the Time Out London website. The analysis showed that 49.5% of the Twitter traffic comments are reported approximately five hours prior to the authorities' official records. Moreover, we discovered that, amongst the scheduled sociocultural event topics, tweets reporting transportation, cultural and social events are 31.75% more likely to influence the distribution of the Twitter comments than sport, weather and crime topics.
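
    The paper's exact architecture is not reproduced here; the following PyTorch sketch only illustrates the general multi-view idea, fusing a word-level and a character-level view of a tweet into a single event classifier. All vocabulary sizes, layer widths and class counts are assumptions.

```python
# Minimal two-view fusion classifier sketch; sizes and views are assumed, not the paper's.
import torch
import torch.nn as nn

class MultiViewEventClassifier(nn.Module):
    def __init__(self, word_vocab=20000, char_vocab=100, emb=64, hidden=128, n_classes=10):
        super().__init__()
        # View 1: word-level encoder (mean of word embeddings).
        self.word_emb = nn.EmbeddingBag(word_vocab, emb, mode="mean")
        # View 2: character-level encoder (1-D convolution + global max pooling).
        self.char_emb = nn.Embedding(char_vocab, emb)
        self.char_conv = nn.Conv1d(emb, emb, kernel_size=5, padding=2)
        # Fusion layer learns a joint representation over both views.
        self.fuse = nn.Sequential(
            nn.Linear(2 * emb, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, word_ids, char_ids):
        v1 = self.word_emb(word_ids)                      # (batch, emb)
        c = self.char_emb(char_ids).transpose(1, 2)       # (batch, emb, chars)
        v2 = torch.relu(self.char_conv(c)).max(dim=2).values
        return self.fuse(torch.cat([v1, v2], dim=1))      # event-class logits

model = MultiViewEventClassifier()
words = torch.randint(0, 20000, (8, 30))   # 8 tweets, 30 word ids each
chars = torch.randint(0, 100, (8, 140))    # 140 character ids each
print(model(words, chars).shape)           # torch.Size([8, 10])
```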

    An Inception-Residual-Based Architecture with Multi-Objective Loss for Detecting Respiratory Anomalies

    This paper presents a deep learning system for detecting anomalies in respiratory sound recordings. Our system begins with audio feature extraction using Gammatone and Continuous Wavelet transformations. This step transforms the respiratory sound input into two-dimensional spectrograms in which both spectral and temporal features are represented. Our proposed system then integrates Inception-residual-based backbone models combined with multi-head attention and a multi-objective loss to classify respiratory anomalies. Instead of simply concatenating the results from the various spectrograms, we propose a linear combination, which can regulate the contribution of each individual spectrogram throughout the training process. To evaluate performance, we conducted experiments on the benchmark SPRSound (Open-Source SJTU Paediatric Respiratory Sound) dataset proposed by the IEEE BioCAS 2022 challenge. In terms of the Score, computed as the mean of the average score and the harmonic score, our proposed system gained significant improvements of 9.7%, 15.8%, 17.8%, and 16.1% in Task 1-1, Task 1-2, Task 2-1, and Task 2-2, respectively, over the challenge baseline system. Notably, we achieved the top-1 performance in Task 2-1 and Task 2-2 with the highest Scores of 74.5% and 53.9%, respectively.
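
    The sketch below illustrates the proposed linear combination of per-spectrogram outputs with trainable weights, as opposed to simple concatenation. The small convolutional branches stand in for the Inception-residual backbones and are not the paper's actual models.

```python
# Sketch of a trainable linear combination of per-spectrogram branch logits.
# The per-branch CNNs are placeholders for the Inception-residual backbones.
import torch
import torch.nn as nn

class LinearCombination(nn.Module):
    """Weights the logits from each spectrogram branch; weights sum to 1."""
    def __init__(self, n_branches):
        super().__init__()
        self.raw_weights = nn.Parameter(torch.zeros(n_branches))

    def forward(self, branch_logits):                 # list of (batch, classes)
        w = torch.softmax(self.raw_weights, dim=0)    # trainable mixing weights
        return sum(wi * li for wi, li in zip(w, branch_logits))

def make_branch(n_classes):
    # Placeholder per-spectrogram CNN branch.
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes))

class AnomalyDetector(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.gamma_branch = make_branch(n_classes)   # Gammatone spectrogram branch
        self.cwt_branch = make_branch(n_classes)     # continuous-wavelet spectrogram branch
        self.combine = LinearCombination(2)

    def forward(self, gamma_spec, cwt_spec):
        return self.combine([self.gamma_branch(gamma_spec), self.cwt_branch(cwt_spec)])

model = AnomalyDetector()
logits = model(torch.randn(8, 1, 64, 128), torch.randn(8, 1, 64, 128))
print(logits.shape)    # torch.Size([8, 4])
```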

    Audio-Based Deep Learning Frameworks for Detecting COVID-19

    This paper evaluates a wide range of audio-based deep learning frameworks applied to breathing, cough, and speech sounds for detecting COVID-19. In general, the audio recordings are transformed into low-level spectrogram features, which are fed into pre-trained deep learning models to extract high-level embedding features. Next, the dimensionality of these high-level embeddings is reduced before fine-tuning with a Light Gradient Boosting Machine (LightGBM) back-end classifier. Our experiments on the Second DiCOVA Challenge achieved the highest Area Under the Curve (AUC), F1 score, sensitivity, and specificity of 89.03%, 64.41%, 63.33%, and 95.13%, respectively. Based on these scores, our method outperforms the state-of-the-art systems and improves on the challenge baseline by 4.33%, 6.00%, and 8.33% in terms of AUC, F1 score, and sensitivity, respectively.
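
    A minimal sketch of the back-end of this pipeline, assuming embeddings have already been extracted by a pre-trained model: PCA for dimensionality reduction followed by a LightGBM classifier. The random embeddings and dimensions are placeholders, not the challenge data.

```python
# Back-end sketch: reduce pre-extracted deep embeddings with PCA, classify with LightGBM.
# The embeddings below are random placeholders for illustration only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from lightgbm import LGBMClassifier

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(400, 2048))   # one deep embedding per recording
labels = rng.integers(0, 2, size=400)       # 0 = negative, 1 = COVID-positive

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.25, random_state=0)

# Reduce the high-dimensional embeddings before the back-end classifier.
pca = PCA(n_components=64).fit(X_train)
clf = LGBMClassifier(n_estimators=300, learning_rate=0.05)
clf.fit(pca.transform(X_train), y_train)

scores = clf.predict_proba(pca.transform(X_test))[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
```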

    Using Machine Learning for Anomaly Detection on a System-on-Chip under Gamma Radiation

    The emergence of new nanoscale technologies has imposed significant challenges on designing reliable electronic systems for radiation environments. Some radiation effects, such as Total Ionizing Dose (TID), often cause permanent damage to such nanoscale electronic devices, and current state-of-the-art approaches to tackling TID rely on expensive radiation-hardened devices. This paper focuses on a different, novel approach: using machine learning algorithms on consumer-grade Field Programmable Gate Arrays (FPGAs) to monitor TID effects so that boards can be replaced before they stop working. The research challenge is to anticipate when a board will suffer total failure due to TID effects. We observed internal measurements of the FPGA boards under gamma radiation and used three different anomaly detection machine learning (ML) algorithms to detect anomalies in the sensor measurements in a gamma-irradiated environment. The statistical results show a highly significant relationship between the gamma radiation exposure levels and the board measurements. Moreover, our anomaly detection results show that a One-Class Support Vector Machine with a Radial Basis Function kernel achieves an average recall score of 0.95, and that all anomalies can be detected before the boards stop working.
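
    A minimal sketch of the anomaly-detection step with a One-Class SVM (RBF kernel), trained only on readings from a healthy board; the synthetic values stand in for the FPGA's internal sensor measurements.

```python
# One-Class SVM (RBF) anomaly-detection sketch on synthetic board sensor readings.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
# Synthetic stand-ins for e.g. two supply voltages and a temperature reading.
normal = rng.normal(loc=[1.00, 0.95, 45.0], scale=0.02, size=(500, 3))   # healthy board
degraded = rng.normal(loc=[1.08, 0.80, 60.0], scale=0.05, size=(50, 3))  # under TID stress

scaler = StandardScaler().fit(normal)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(scaler.transform(normal))

preds = model.predict(scaler.transform(degraded))   # -1 = anomaly, +1 = normal
print(f"Recall on degraded readings: {np.mean(preds == -1):.2f}")
```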

    Explainable Early Prediction of Gestational Diabetes Biomarkers by Combining Medical Background and Wearable Devices: A Pilot Study with a Cohort Group in South Africa

    This study explores the potential of Internet of Things (IoT) devices and explainable Artificial Intelligence (AI) techniques for predicting biomarker values associated with gestational diabetes mellitus (GDM) when measured 13-16 weeks prior to diagnosis. We developed a system that forecasts biomarkers such as LDL, HDL, triglycerides, cholesterol, HbA1c, and results from the Oral Glucose Tolerance Test (OGTT), including fasting glucose and 1-hour and 2-hour post-load glucose values. These biomarker values are predicted from sensory measurements collected around week 12 of pregnancy, including continuous glucose levels, short physical movement recordings, and medical background information. To the best of our knowledge, this is the first study to forecast GDM-associated biomarker values 13 to 16 weeks prior to the GDM screening test using continuous glucose monitoring devices, a wristband for activity detection, and medical background data. We applied machine learning models, specifically Decision Tree and Random Forest regressors, along with Coupled-Matrix Tensor Factorisation (CMTF) and Elastic Net techniques, examining all possible combinations of these methods across the different data modalities. The results demonstrated good performance for most biomarkers. On average, the models achieved a Mean Squared Error (MSE) between 0.29 and 0.42 and a Mean Absolute Error (MAE) between 0.23 and 0.45 for biomarkers such as HDL, LDL, cholesterol, and HbA1c. For the OGTT glucose values, the average MSE ranged from 0.95 to 2.44 and the average MAE from 0.72 to 0.91. Additionally, CMTF with the Alternating Least Squares technique yielded slightly better results (0.16 MSE and 0.07 MAE on average) compared to the well-known Elastic Net feature selection technique. While our study was conducted with a limited cohort in South Africa, our findings offer promising indications regarding the potential for predicting biomarker values in pregnant women by integrating wearable devices and medical background data in the analysis. Nevertheless, further validation on a larger, more diverse cohort is needed to substantiate these encouraging results.
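
    A small sketch of the tabular baseline described above, predicting a single biomarker (here HbA1c) from week-12 features with a Random Forest regressor and an Elastic Net. The feature names and synthetic values are illustrative only and do not reflect the study's cohort data.

```python
# Tabular baseline sketch: Random Forest and Elastic Net regressors for one biomarker.
# Feature names and values are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 60
features = pd.DataFrame({
    "mean_glucose_week12": rng.normal(5.5, 0.8, n),   # from the CGM device
    "glucose_variability": rng.normal(0.9, 0.3, n),
    "daily_step_count": rng.normal(6000, 2000, n),    # from the wristband
    "age": rng.integers(20, 45, n),
    "bmi": rng.normal(28, 5, n),
})
# Synthetic target loosely tied to the glucose feature, for illustration only.
hba1c = 4.0 + 0.3 * features["mean_glucose_week12"] + rng.normal(0, 0.2, n)

for name, model in [("Random Forest", RandomForestRegressor(n_estimators=200, random_state=0)),
                    ("Elastic Net", ElasticNet(alpha=0.1))]:
    mae = -cross_val_score(model, features, hba1c, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name}: MAE = {mae:.2f}")
```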

    Early Detection of COPD Patients’ Symptoms with Personal Environmental Sensors: A Remote Sensing Framework using Probabilistic Latent Component Analysis with Linear Dynamic Systems

    In this study, we present a cohort study involving 106 COPD patients using portable environmental sensor nodes with attached air pollution sensors and activity-related sensors, as well as daily symptom records and peak flow measurements, to monitor patients' activity and personal exposure to air pollution. This is the first study that attempts to predict COPD symptoms based on personal air pollution exposure. We developed a system that can detect COPD patients' symptoms one day before the symptoms appear. We propose using the Probabilistic Latent Component Analysis (PLCA) model, based on 3-dimensional and 4-dimensional spectral dictionary tensors for personalised and population monitoring, respectively. The model is combined with Linear Dynamic Systems (LDS) to track the patients' symptoms. We compared the performance of the PLCA and PLCA-LDS models against Random Forest models in identifying COPD patients' symptoms, since tree-based classifiers have been used for remote monitoring of COPD patients in the literature. We found significant differences between the classifiers, the symptoms, and the personalised versus population factors. Our results show that the proposed PLCA-LDS-3D model outperformed the PLCA and RF models by between 4% and 20% on average. When only air pollutants were used as input, the PLCA-LDS-3D forecasting accuracy for the personalised and population models was 48.67% and 36.33%, respectively, for worsening of lung capacity, and 38.67% and 19% for exacerbation of COPD patients' symptoms. We have shown that indicators of the quality of an individual's environment, specifically air pollutants, are as good predictors of the worsening of respiratory symptoms in COPD patients as direct measurements.
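
    The paper's model uses 3-D and 4-D tensors coupled with a Linear Dynamic System; the sketch below only illustrates the underlying PLCA factorisation, applying simplified two-dimensional EM updates to a synthetic pollutant-by-day exposure matrix.

```python
# Simplified 2-D PLCA decomposition via EM on a synthetic exposure matrix.
# Not the paper's 3-D/4-D PLCA-LDS model; illustrative only.
import numpy as np

rng = np.random.default_rng(3)
V = rng.poisson(lam=5.0, size=(6, 30)).astype(float)   # pollutants x days counts
V /= V.sum()                                           # PLCA models a joint distribution

K = 2                                                  # latent exposure components
Pz = np.full(K, 1.0 / K)
Px_z = rng.random((V.shape[0], K)); Px_z /= Px_z.sum(axis=0)
Py_z = rng.random((V.shape[1], K)); Py_z /= Py_z.sum(axis=0)

for _ in range(200):
    # E-step: posterior P(z | x, y) for every cell of V
    joint = Pz[None, None, :] * Px_z[:, None, :] * Py_z[None, :, :]   # (X, Y, K)
    post = joint / joint.sum(axis=2, keepdims=True)
    # M-step: re-estimate the factors from the expected counts
    weighted = V[:, :, None] * post
    Pz = weighted.sum(axis=(0, 1))
    Px_z = weighted.sum(axis=1) / Pz
    Py_z = weighted.sum(axis=0) / Pz
    Pz /= Pz.sum()

print("component weights P(z):", np.round(Pz, 3))
print("pollutant profiles P(x|z):\n", np.round(Px_z, 3))
```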

    Recruitment of patients with Chronic Obstructive Pulmonary Disease (COPD) from the Clinical Practice Research Datalink (CPRD) for research.

    Databases of electronic health records (EHR) are not only a valuable source of data for health research but have also recently been used as a medium through which potential study participants can be screened, located and approached to take part in research. The aim was to assess whether it is feasible and practical to screen, locate and approach patients to take part in research through the Clinical Practice Research Datalink (CPRD). This is a cohort study in primary care. The CPRD anonymised EHR database was searched to screen patients with Chronic Obstructive Pulmonary Disease (COPD) for participation in a research study. The potential participants were contacted via their General Practitioner (GP), who confirmed their eligibility. Eighty-two practices across Greater London were invited to the study. Twenty-six (31.7%) practices consented to participate, resulting in a pre-screened list of 988 patients. Of these, 632 (63.7%) were confirmed as eligible following the GP review. Two hundred and twenty-seven (36%) response forms were received by the study team; 79 (34.8%) responded 'yes' (i.e., they wanted to be contacted by the research assistant for more information and to discuss enrolling in the study), and 148 (65.2%) declined participation. This study has shown that it is possible to use EHR databases such as CPRD to screen, locate and recruit participants for research. This method provides access to a cohort of patients while minimising the input needed from GPs, and allows researchers to examine healthcare usage and disease burden in more detail and in real-life settings.

    Best Practices for Publishing, Retrieving, and Using Spatial Data on the Web

    Data owners are creating an ever richer set of information resources online, and these are being used for more and more applications. With the rapid growth of connected embedded devices, GPS-enabled mobile devices, and various organizations that publish their location-based data (e.g., weather and traffic services), maps, and geographical and spatial information (e.g., GIS and open maps), spatial data on the Web is becoming ubiquitous and voluminous. However, the heterogeneity of the available spatial data, as well as a number of challenges specific to spatial data, make it difficult for data users, web applications and services to discover, interpret and use this information in large and distributed web systems. This paper summarizes some of the efforts undertaken in the joint W3C/OGC Working Group on Spatial Data on the Web, in particular the effort to describe best practices for publishing spatial data on the Web. The paper presents the set of principles that guide the selection of these best practices, describes the best practices employed to enable publishing, discovering and retrieving (querying) this type of data on the Web, and identifies some areas where a best practice has not yet emerged.
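
    As a purely illustrative (non-normative) example of one of these practices, the snippet below exposes a spatial feature as Web-friendly GeoJSON with a persistent HTTP URI and a link to a related resource; all identifiers and values are placeholders.

```python
# Illustrative example: a spatial thing published as GeoJSON with a stable HTTP URI.
# URIs, names and coordinates are hypothetical placeholders.
import json

feature = {
    "type": "Feature",
    "id": "https://example.org/id/bus-stop/1234",                      # persistent HTTP URI
    "geometry": {"type": "Point", "coordinates": [-0.0754, 51.5230]},  # lon, lat (WGS 84)
    "properties": {
        "name": "Example Bus Stop",
        "seeAlso": "https://example.org/id/route/8",                   # link to a related resource
    },
}
print(json.dumps(feature, indent=2))
```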

    Automatic Ontology Generation for Musical Instruments Based on Audio Analysis
